OCR exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Text Recognition in Multimedia Documents: A Study of two Neural-based OCRs Using and Avoiding Character Segmentation

Internal identifier: 000036 (France/Analysis); previous: 000035; next: 000037

Authors: Khaoula Elagouni [France]; Christophe Garcia [France]; Franck Mamalet [France]; Pascale Sébillot [France]

RBID : Hal:hal-00867225

English descriptors

Abstract

Text embedded in multimedia documents represents an important semantic information that helps to automatically access the content. This paper proposes two neural-based OCRs that handle the text recognition problem in different ways. The first approach segments a text image into individual characters before recognizing them, while the second one avoids the segmentation step by integrating a multi-scale scanning scheme that allows to jointly localize and recognize characters at each position and scale. Some linguistic knowledge is also incorporated into the proposed schemes to remove errors due to recognition confusions. Both OCR systems are applied to caption texts embedded in videos and in natural scene images and provide outstanding results showing that the proposed approaches outperform the state-of-the-art methods.

Url:
DOI: 10.1007/s10032-013-0202-7


Links to Exploration step

Hal:hal-00867225

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Text Recognition in Multimedia Documents: A Study of two Neural-based OCRs Using and Avoiding Character Segmentation</title>
<author>
<name sortKey="Elagouni, Khaoula" sort="Elagouni, Khaoula" uniqKey="Elagouni K" first="Khaoula" last="Elagouni">Khaoula Elagouni</name>
<affiliation wicri:level="1">
<hal:affiliation type="laboratory" xml:id="struct-121780" status="VALID">
<orgName>Orange Labs R&amp;D [Rennes]</orgName>
<desc>
<address>
<addrLine>4 rue du Clos Courtel, 35112 Cesson-Sévigné Cedex, France</addrLine>
<country key="FR"></country>
</address>
</desc>
<listRelation>
<relation active="#struct-300518" type="direct"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-300518" type="direct">
<org type="institution" xml:id="struct-300518" status="VALID">
<orgName>France Télécom</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
</affiliation>
</author>
<author>
<name sortKey="Garcia, Christophe" sort="Garcia, Christophe" uniqKey="Garcia C" first="Christophe" last="Garcia">Christophe Garcia</name>
<affiliation wicri:level="1">
<hal:affiliation type="researchteam" xml:id="struct-403930" status="VALID">
<orgName>Extraction de Caractéristiques et Identification</orgName>
<orgName type="acronym">imagine</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
<listRelation>
<relation active="#struct-2003" type="direct"></relation>
<relation active="#struct-33804" type="indirect"></relation>
<relation active="#struct-126765" type="indirect"></relation>
<relation active="#struct-194495" type="indirect"></relation>
<relation name="- LYON" active="#struct-301232" type="indirect"></relation>
<relation name="UMR5205" active="#struct-441569" type="indirect"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-2003" type="direct">
<org type="laboratory" xml:id="struct-2003" status="VALID">
<orgName>Laboratoire d'InfoRmatique en Image et Systèmes d'information</orgName>
<orgName type="acronym">LIRIS</orgName>
<desc>
<address>
<addrLine>Bâtiment Blaise Pascal - 20, avenue Albert Einstein - 69621 Villeurbanne cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://liris.cnrs.fr/</ref>
</desc>
<listRelation>
<relation active="#struct-33804" type="direct"></relation>
<relation active="#struct-126765" type="direct"></relation>
<relation active="#struct-194495" type="direct"></relation>
<relation name="- LYON" active="#struct-301232" type="direct"></relation>
<relation name="UMR5205" active="#struct-441569" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-33804" type="indirect">
<org type="institution" xml:id="struct-33804" status="VALID">
<orgName>Université Lumière - Lyon 2</orgName>
<orgName type="acronym">UL2</orgName>
<desc>
<address>
<addrLine>86, rue Pasteur - 69007 Lyon</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-lyon2.fr</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-126765" type="indirect">
<org type="institution" xml:id="struct-126765" status="VALID">
<orgName>École Centrale de Lyon</orgName>
<orgName type="acronym">ECL</orgName>
<desc>
<address>
<addrLine>36 avenue Guy de Collongue - 69134 Ecully cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.ec-lyon.fr</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-194495" type="indirect">
<org type="institution" xml:id="struct-194495" status="VALID">
<orgName>Université Claude Bernard Lyon 1</orgName>
<orgName type="acronym">UCBL</orgName>
<desc>
<address>
<addrLine>43, boulevard du 11 novembre 1918, 69622 Villeurbanne cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-lyon1.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle name="- LYON" active="#struct-301232" type="indirect">
<org type="institution" xml:id="struct-301232" status="VALID">
<orgName>Institut National des Sciences Appliquées</orgName>
<orgName type="acronym">INSA</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle name="UMR5205" active="#struct-441569" type="indirect">
<org type="institution" xml:id="struct-441569" status="VALID">
<idno type="IdRef">02636817X</idno>
<idno type="ISNI">0000000122597504</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
<placeName>
<settlement type="city">Lyon</settlement>
<region type="region" nuts="2">Auvergne-Rhône-Alpes</region>
<region type="old region" nuts="2">Rhône-Alpes</region>
</placeName>
<orgName type="university">Université Claude Bernard Lyon 1</orgName>
<orgName type="institution" wicri:auto="newGroup">Université de Lyon</orgName>
</affiliation>
</author>
<author>
<name sortKey="Mamalet, Franck" sort="Mamalet, Franck" uniqKey="Mamalet F" first="Franck" last="Mamalet">Franck Mamalet</name>
<affiliation wicri:level="1">
<hal:affiliation type="laboratory" xml:id="struct-121780" status="VALID">
<orgName>Orange Labs R&amp;D [Rennes]</orgName>
<desc>
<address>
<addrLine>4 rue du Clos Courtel, 35112 Cesson-Sévigné Cedex, France</addrLine>
<country key="FR"></country>
</address>
</desc>
<listRelation>
<relation active="#struct-300518" type="direct"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-300518" type="direct">
<org type="institution" xml:id="struct-300518" status="VALID">
<orgName>France Télécom</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
</affiliation>
</author>
<author>
<name sortKey="Sebillot, Pascale" sort="Sebillot, Pascale" uniqKey="Sebillot P" first="Pascale" last="Sébillot">Pascale Sébillot</name>
<affiliation wicri:level="1">
<hal:affiliation type="researchteam" xml:id="struct-2538" status="OLD">
<idno type="RNSR">200218363F</idno>
<orgName>Multimedia content-based indexing</orgName>
<orgName type="acronym">TEXMEX</orgName>
<date type="start">2002-11-01</date>
<date type="end">2014-12-31</date>
<desc>
<address>
<addrLine>Campus de Beaulieu, 35042 Rennes cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/equipes/texmex</ref>
</desc>
<listRelation>
<relation active="#struct-419153" type="direct"></relation>
<relation active="#struct-300009" type="indirect"></relation>
<relation active="#struct-2494" type="direct"></relation>
<relation active="#struct-105160" type="indirect"></relation>
<relation active="#struct-117606" type="indirect"></relation>
<relation name="UMR6074" active="#struct-441569" type="indirect"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-419153" type="direct">
<org type="laboratory" xml:id="struct-419153" status="VALID">
<idno type="RNSR">198018249C</idno>
<orgName>Inria Rennes – Bretagne Atlantique</orgName>
<desc>
<address>
<addrLine>Campus de Beaulieu, 35042 Rennes cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/centre/rennes</ref>
</desc>
<listRelation>
<relation active="#struct-300009" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-300009" type="indirect">
<org type="institution" xml:id="struct-300009" status="VALID">
<orgName>Institut National de Recherche en Informatique et en Automatique</orgName>
<orgName type="acronym">Inria</orgName>
<desc>
<address>
<addrLine>Domaine de Voluceau, Rocquencourt - BP 105, 78153 Le Chesnay Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/en/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-2494" type="direct">
<org type="laboratory" xml:id="struct-2494" status="OLD">
<orgName>Institut de Recherche en Informatique et Systèmes Aléatoires</orgName>
<orgName type="acronym">IRISA</orgName>
<desc>
<address>
<addrLine>Campus universitaire de Beaulieu - 35042 Rennes</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.irisa.fr/</ref>
</desc>
<listRelation>
<relation active="#struct-105160" type="direct"></relation>
<relation active="#struct-117606" type="direct"></relation>
<relation active="#struct-300009" type="direct"></relation>
<relation name="UMR6074" active="#struct-441569" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-105160" type="indirect">
<org type="institution" xml:id="struct-105160" status="VALID">
<orgName>Université de Rennes 1</orgName>
<orgName type="acronym">UR1</orgName>
<desc>
<address>
<addrLine>2 rue du Thabor - CS 46510 - 35065 Rennes cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-rennes1.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-117606" type="indirect">
<org type="institution" xml:id="struct-117606" status="VALID">
<orgName>Institut National des Sciences Appliquées - Rennes</orgName>
<orgName type="acronym">INSA Rennes</orgName>
<desc>
<address>
<addrLine>20, avenue des Buttes de Coësmes - CS 70839 - 35708 Rennes cedex 7</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.insa-rennes.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle name="UMR6074" active="#struct-441569" type="indirect">
<org type="institution" xml:id="struct-441569" status="VALID">
<idno type="IdRef">02636817X</idno>
<idno type="ISNI">0000000122597504</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
<placeName>
<settlement type="city">Rennes</settlement>
<region type="region" nuts="2">Région Bretagne</region>
</placeName>
<orgName type="university">Université de Rennes 1</orgName>
<orgName type="institution" wicri:auto="newGroup">Université européenne de Bretagne</orgName>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">HAL</idno>
<idno type="RBID">Hal:hal-00867225</idno>
<idno type="halId">hal-00867225</idno>
<idno type="halUri">https://hal.archives-ouvertes.fr/hal-00867225</idno>
<idno type="url">https://hal.archives-ouvertes.fr/hal-00867225</idno>
<idno type="doi">10.1007/s10032-013-0202-7</idno>
<date when="2014-03">2014-03</date>
<idno type="wicri:Area/Hal/Corpus">000116</idno>
<idno type="wicri:Area/Hal/Curation">000116</idno>
<idno type="wicri:Area/Hal/Checkpoint">000037</idno>
<idno type="wicri:Area/Main/Merge">000077</idno>
<idno type="wicri:Area/Main/Curation">000076</idno>
<idno type="wicri:Area/Main/Exploration">000076</idno>
<idno type="wicri:Area/France/Extraction">000036</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en">Text Recognition in Multimedia Documents: A Study of two Neural-based OCRs Using and Avoiding Character Segmentation</title>
<author>
<name sortKey="Elagouni, Khaoula" sort="Elagouni, Khaoula" uniqKey="Elagouni K" first="Khaoula" last="Elagouni">Khaoula Elagouni</name>
<affiliation wicri:level="1">
<hal:affiliation type="laboratory" xml:id="struct-121780" status="VALID">
<orgName>Orange Labs R&amp;D [Rennes]</orgName>
<desc>
<address>
<addrLine>4 rue du Clos Courtel, 35112 Cesson-Sévigné Cedex, France</addrLine>
<country key="FR"></country>
</address>
</desc>
<listRelation>
<relation active="#struct-300518" type="direct"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-300518" type="direct">
<org type="institution" xml:id="struct-300518" status="VALID">
<orgName>France Télécom</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
</affiliation>
</author>
<author>
<name sortKey="Garcia, Christophe" sort="Garcia, Christophe" uniqKey="Garcia C" first="Christophe" last="Garcia">Christophe Garcia</name>
<affiliation wicri:level="1">
<hal:affiliation type="researchteam" xml:id="struct-403930" status="VALID">
<orgName>Extraction de Caractéristiques et Identification</orgName>
<orgName type="acronym">imagine</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
<listRelation>
<relation active="#struct-2003" type="direct"></relation>
<relation active="#struct-33804" type="indirect"></relation>
<relation active="#struct-126765" type="indirect"></relation>
<relation active="#struct-194495" type="indirect"></relation>
<relation name="- LYON" active="#struct-301232" type="indirect"></relation>
<relation name="UMR5205" active="#struct-441569" type="indirect"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-2003" type="direct">
<org type="laboratory" xml:id="struct-2003" status="VALID">
<orgName>Laboratoire d'InfoRmatique en Image et Systèmes d'information</orgName>
<orgName type="acronym">LIRIS</orgName>
<desc>
<address>
<addrLine>Bâtiment Blaise Pascal - 20, avenue Albert Einstein - 69621 Villeurbanne cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://liris.cnrs.fr/</ref>
</desc>
<listRelation>
<relation active="#struct-33804" type="direct"></relation>
<relation active="#struct-126765" type="direct"></relation>
<relation active="#struct-194495" type="direct"></relation>
<relation name="- LYON" active="#struct-301232" type="direct"></relation>
<relation name="UMR5205" active="#struct-441569" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-33804" type="indirect">
<org type="institution" xml:id="struct-33804" status="VALID">
<orgName>Université Lumière - Lyon 2</orgName>
<orgName type="acronym">UL2</orgName>
<desc>
<address>
<addrLine>86, rue Pasteur - 69007 Lyon</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-lyon2.fr</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-126765" type="indirect">
<org type="institution" xml:id="struct-126765" status="VALID">
<orgName>École Centrale de Lyon</orgName>
<orgName type="acronym">ECL</orgName>
<desc>
<address>
<addrLine>36 avenue Guy de Collongue - 69134 Ecully cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.ec-lyon.fr</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-194495" type="indirect">
<org type="institution" xml:id="struct-194495" status="VALID">
<orgName>Université Claude Bernard Lyon 1</orgName>
<orgName type="acronym">UCBL</orgName>
<desc>
<address>
<addrLine>43, boulevard du 11 novembre 1918, 69622 Villeurbanne cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-lyon1.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle name="- LYON" active="#struct-301232" type="indirect">
<org type="institution" xml:id="struct-301232" status="VALID">
<orgName>Institut National des Sciences Appliquées</orgName>
<orgName type="acronym">INSA</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle name="UMR5205" active="#struct-441569" type="indirect">
<org type="institution" xml:id="struct-441569" status="VALID">
<idno type="IdRef">02636817X</idno>
<idno type="ISNI">0000000122597504</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
<placeName>
<settlement type="city">Lyon</settlement>
<region type="region" nuts="2">Auvergne-Rhône-Alpes</region>
<region type="old region" nuts="2">Rhône-Alpes</region>
</placeName>
<orgName type="university">Université Claude Bernard Lyon 1</orgName>
<orgName type="institution" wicri:auto="newGroup">Université de Lyon</orgName>
</affiliation>
</author>
<author>
<name sortKey="Mamalet, Franck" sort="Mamalet, Franck" uniqKey="Mamalet F" first="Franck" last="Mamalet">Franck Mamalet</name>
<affiliation wicri:level="1">
<hal:affiliation type="laboratory" xml:id="struct-121780" status="VALID">
<orgName>Orange Labs R&amp;D [Rennes]</orgName>
<desc>
<address>
<addrLine>4 rue du Clos Courtel, 35112 Cesson-Sévigné Cedex, France</addrLine>
<country key="FR"></country>
</address>
</desc>
<listRelation>
<relation active="#struct-300518" type="direct"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-300518" type="direct">
<org type="institution" xml:id="struct-300518" status="VALID">
<orgName>France Télécom</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
</affiliation>
</author>
<author>
<name sortKey="Sebillot, Pascale" sort="Sebillot, Pascale" uniqKey="Sebillot P" first="Pascale" last="Sébillot">Pascale Sébillot</name>
<affiliation wicri:level="1">
<hal:affiliation type="researchteam" xml:id="struct-2538" status="OLD">
<idno type="RNSR">200218363F</idno>
<orgName>Multimedia content-based indexing</orgName>
<orgName type="acronym">TEXMEX</orgName>
<date type="start">2002-11-01</date>
<date type="end">2014-12-31</date>
<desc>
<address>
<addrLine>Campus de Beaulieu, 35042 Rennes cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/equipes/texmex</ref>
</desc>
<listRelation>
<relation active="#struct-419153" type="direct"></relation>
<relation active="#struct-300009" type="indirect"></relation>
<relation active="#struct-2494" type="direct"></relation>
<relation active="#struct-105160" type="indirect"></relation>
<relation active="#struct-117606" type="indirect"></relation>
<relation name="UMR6074" active="#struct-441569" type="indirect"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-419153" type="direct">
<org type="laboratory" xml:id="struct-419153" status="VALID">
<idno type="RNSR">198018249C</idno>
<orgName>Inria Rennes – Bretagne Atlantique</orgName>
<desc>
<address>
<addrLine>Campus de Beaulieu, 35042 Rennes cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/centre/rennes</ref>
</desc>
<listRelation>
<relation active="#struct-300009" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-300009" type="indirect">
<org type="institution" xml:id="struct-300009" status="VALID">
<orgName>Institut National de Recherche en Informatique et en Automatique</orgName>
<orgName type="acronym">Inria</orgName>
<desc>
<address>
<addrLine>Domaine de Voluceau, Rocquencourt - BP 105, 78153 Le Chesnay Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/en/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-2494" type="direct">
<org type="laboratory" xml:id="struct-2494" status="OLD">
<orgName>Institut de Recherche en Informatique et Systèmes Aléatoires</orgName>
<orgName type="acronym">IRISA</orgName>
<desc>
<address>
<addrLine>Campus universitaire de Beaulieu - 35042 Rennes</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.irisa.fr/</ref>
</desc>
<listRelation>
<relation active="#struct-105160" type="direct"></relation>
<relation active="#struct-117606" type="direct"></relation>
<relation active="#struct-300009" type="direct"></relation>
<relation name="UMR6074" active="#struct-441569" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-105160" type="indirect">
<org type="institution" xml:id="struct-105160" status="VALID">
<orgName>Université de Rennes 1</orgName>
<orgName type="acronym">UR1</orgName>
<desc>
<address>
<addrLine>2 rue du Thabor - CS 46510 - 35065 Rennes cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-rennes1.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-117606" type="indirect">
<org type="institution" xml:id="struct-117606" status="VALID">
<orgName>Institut National des Sciences Appliquées - Rennes</orgName>
<orgName type="acronym">INSA Rennes</orgName>
<desc>
<address>
<addrLine>20, avenue des Buttes de Coësmes - CS 70839 - 35708 Rennes cedex 7</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.insa-rennes.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle name="UMR6074" active="#struct-441569" type="indirect">
<org type="institution" xml:id="struct-441569" status="VALID">
<idno type="IdRef">02636817X</idno>
<idno type="ISNI">0000000122597504</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
<placeName>
<settlement type="city">Rennes</settlement>
<region type="region" nuts="2">Région Bretagne</region>
</placeName>
<orgName type="university">Université de Rennes 1</orgName>
<orgName type="institution" wicri:auto="newGroup">Université européenne de Bretagne</orgName>
</affiliation>
</author>
</analytic>
<idno type="DOI">10.1007/s10032-013-0202-7</idno>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="mix" xml:lang="en">
<term>OCR</term>
<term>character segmentation</term>
<term>convolutional neural network</term>
<term>language model</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Text embedded in multimedia documents represents an important semantic information that helps to automatically access the content. This paper proposes two neural-based OCRs that handle the text recognition problem in different ways. The first approach segments a text image into individual characters before recognizing them, while the second one avoids the segmentation step by integrating a multi-scale scanning scheme that allows to jointly localize and recognize characters at each position and scale. Some linguistic knowledge is also incorporated into the proposed schemes to remove errors due to recognition confusions. Both OCR systems are applied to caption texts embedded in videos and in natural scene images and provide outstanding results showing that the proposed approaches outperform the state-of-the-art methods.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>France</li>
</country>
<region>
<li>Auvergne-Rhône-Alpes</li>
<li>Rhône-Alpes</li>
<li>Région Bretagne</li>
</region>
<settlement>
<li>Lyon</li>
<li>Rennes</li>
</settlement>
<orgName>
<li>Université Claude Bernard Lyon 1</li>
<li>Université de Lyon</li>
<li>Université de Rennes 1</li>
<li>Université européenne de Bretagne</li>
</orgName>
</list>
<tree>
<country name="France">
<noRegion>
<name sortKey="Elagouni, Khaoula" sort="Elagouni, Khaoula" uniqKey="Elagouni K" first="Khaoula" last="Elagouni">Khaoula Elagouni</name>
</noRegion>
<name sortKey="Garcia, Christophe" sort="Garcia, Christophe" uniqKey="Garcia C" first="Christophe" last="Garcia">Christophe Garcia</name>
<name sortKey="Mamalet, Franck" sort="Mamalet, Franck" uniqKey="Mamalet F" first="Franck" last="Mamalet">Franck Mamalet</name>
<name sortKey="Sebillot, Pascale" sort="Sebillot, Pascale" uniqKey="Sebillot P" first="Pascale" last="Sébillot">Pascale Sébillot</name>
</country>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/France/Analysis
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000036 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/France/Analysis/biblio.hfd -nk 000036 | SxmlIndent | more
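
Once a record like the one above has been saved as text, it can also be inspected outside Dilib. The following is a minimal Python sketch, not part of Dilib or Wicri, that lists each author with laboratory and country. It rests on several assumptions: the `hal:` and `wicri:` prefixes are not declared in the dump, so placeholder namespace URIs (`urn:example:...`, invented here) are injected on the `<record>` element; any bare `&` is escaped before parsing; and `SAMPLE` is an abridged, hand-made excerpt of the record rather than the full document. The names `parse_record` and `list_authors` are hypothetical helpers.

```python
import re
import xml.etree.ElementTree as ET

# Placeholder namespace URIs: the record uses the hal: and wicri: prefixes
# without declaring them, so these values are assumptions made only to
# let a standard XML parser accept the text.
PLACEHOLDER_NS = 'xmlns:hal="urn:example:hal" xmlns:wicri="urn:example:wicri"'

# Abridged excerpt of the record, for illustration only.
SAMPLE = """<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Text Recognition in Multimedia Documents</title>
<author>
<name first="Khaoula" last="Elagouni">Khaoula Elagouni</name>
<affiliation wicri:level="1">
<hal:affiliation type="laboratory" xml:id="struct-121780" status="VALID">
<orgName>Orange Labs R&D [Rennes]</orgName>
</hal:affiliation>
<country>France</country>
</affiliation>
</author>
</titleStmt>
</fileDesc>
</teiHeader>
</TEI>
</record>"""


def parse_record(text):
    """Parse a record dump, tolerating bare '&' and undeclared prefixes."""
    # Escape ampersands that do not already start an entity reference.
    text = re.sub(r"&(?!#?\w+;)", "&amp;", text)
    # Declare the prefixes on the root element (first occurrence only).
    text = text.replace("<record>", f"<record {PLACEHOLDER_NS}>", 1)
    return ET.fromstring(text)


def list_authors(root):
    """Return (full name, lab name, country) for each titleStmt author."""
    ns = {"hal": "urn:example:hal"}
    rows = []
    for author in root.iterfind(".//titleStmt/author"):
        name = author.find("name")
        lab = author.find("affiliation/hal:affiliation/orgName", ns)
        country = author.find("affiliation/country")
        rows.append((
            f"{name.get('first')} {name.get('last')}",
            lab.text if lab is not None else None,
            country.text if country is not None else None,
        ))
    return rows


if __name__ == "__main__":
    for row in list_authors(parse_record(SAMPLE)):
        print(row)
```

Restricting the search to `titleStmt` avoids counting each author twice, since the same author blocks are repeated under `sourceDesc/biblStruct/analytic`.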

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    France
   |étape=   Analysis
   |type=    RBID
   |clé=     Hal:hal-00867225
   |texte=   Text Recognition in Multimedia Documents: A Study of two Neural-based OCRs Using and Avoiding Character Segmentation
}}

Wicri

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024